discrete step
Reasoning Steps as Curriculum: Using Depth of Thought as a Difficulty Signal for Tuning LLMs
Curriculum learning for training LLMs requires a difficulty signal that aligns with reasoning while remaining scalable and interpretable. We propose a simple premise: tasks that demand greater depth of thought from humans should also be harder for models. Accordingly, we define difficulty as depth of thought (DoT) and operationalize it by counting the discrete steps in a teacher model's reasoning trace (e.g., Chain-of-Thought). We then train with a shallow-to-deep curriculum ordered by DoT and outline how to derive, validate, and schedule it at scale. Our position yields three testable hypotheses: (i) DoT correlates with conventional difficulty on reasoning benchmarks, (ii) DoT-ordered curricula outperform length- or judge-scored curricula under matched budgets, and (iii) the DoT difficulty signal is robust across teacher models given light formatting controls. We propose an evaluation framework and discuss threats to validity (teacher style, length confounds) alongside practical mitigations. Taken together, we aim to move toward cognitively grounded, interpretable curricula for reasoning-centric training.
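As a rough illustration of the idea in this abstract, the sketch below counts discrete steps in a reasoning trace and sorts examples from shallow to deep. It assumes one step per non-empty line, which is a formatting convention chosen here for illustration and not something the abstract specifies. The names `count_steps`, `order_by_dot`, and the example records are hypothetical.

```python
def count_steps(trace: str) -> int:
    """Return a depth-of-thought proxy: the number of non-empty lines.

    Assumption: the teacher trace is formatted with one step per line.
    """
    return sum(1 for line in trace.splitlines() if line.strip())


def order_by_dot(examples):
    """Sort examples from shallow to deep by step count (stable sort)."""
    return sorted(examples, key=lambda ex: count_steps(ex["trace"]))


# Hypothetical toy data; the traces are placeholders, not real model output.
examples = [
    {"id": "deep", "trace": "step one\nstep two\nstep three\nstep four"},
    {"id": "shallow", "trace": "step one"},
    {"id": "medium", "trace": "step one\n\nstep two"},
]

curriculum_ids = [ex["id"] for ex in order_by_dot(examples)]
print(curriculum_ids)
```

A line-based count is sensitive to teacher style, which is one of the threats to validity the abstract names; any real pipeline would likely need the formatting controls it mentions.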
GitHub - allenai/tango: Organize your experiments into discrete steps that can be cached and reused throughout the lifetime of your research project.
AI2 Tango replaces messy directories and spreadsheets full of file versions by organizing experiments into discrete steps that can be cached and reused throughout the lifetime of a research project. Even though ai2-tango itself is quite small, installing everything will pull in a lot of dependencies. Don't be surprised if this takes a while! You can build a Docker image suitable for tango projects by using the official Dockerfile as a starting point for your own Dockerfile, or you can simply use one of our prebuilt images as a base image in your Dockerfile. Make sure to choose the right base image for your use case depending on the version of tango you're using and the CUDA version that your host machine supports.
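The general idea of a cached, reusable step can be sketched in plain Python. This is not Tango's API: the decorator `cached_step`, the cache directory, and the `tokenize` step are all hypothetical names made up for this illustration, which only shows how a step's result might be keyed by its name and arguments and reused on a later call.

```python
import hashlib
import json
import pickle
import tempfile
from pathlib import Path

# Hypothetical cache location: a fresh temporary directory.
CACHE_DIR = Path(tempfile.mkdtemp())
CALLS = {"count": 0}  # tracks how often the step body actually runs


def cached_step(func):
    """Cache a step's result on disk, keyed by step name and keyword arguments."""
    def wrapper(**kwargs):
        key_src = json.dumps({"step": func.__name__, "args": kwargs}, sort_keys=True)
        key = hashlib.sha256(key_src.encode("utf-8")).hexdigest()
        path = CACHE_DIR / f"{key}.pkl"
        if path.exists():
            # Cache hit: load the stored result instead of recomputing.
            return pickle.loads(path.read_bytes())
        result = func(**kwargs)
        path.write_bytes(pickle.dumps(result))
        return result
    return wrapper


@cached_step
def tokenize(text):
    CALLS["count"] += 1
    return text.split()


first = tokenize(text="a b c")
second = tokenize(text="a b c")  # served from the cache
print(first, second, CALLS["count"])
```

Keying on the arguments means a changed input produces a new cache entry, while an unchanged input reuses the stored result.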
PAGP: A physics-assisted Gaussian process framework with active learning for forward and inverse problems of partial differential equations
Zhang, Jiahao, Zhang, Shiqi, Lin, Guang
In this work, we develop physics-assisted Gaussian processes (PAGP): a Gaussian process regression (GPR) model that incorporates the physical information given by partial differential equations (PDEs). The model targets two types of problem: finding solutions of given PDEs with initial and boundary conditions, and discovering unknown coefficients in them. We introduce three different models: continuous time, discrete time, and hybrid. The given physical information is integrated into the Gaussian process model through our designed GP loss functions. Three types of loss function are provided in this paper, based on two different approaches to training the standard GP model. The first part of the paper introduces the continuous time model, which treats the temporal domain the same as the spatial domain. The unknown coefficients in the given PDEs can be learned jointly with the GP hyper-parameters by minimizing the designed loss function. In the discrete time models, we first choose a time discretization scheme to discretize the temporal domain. The PAGP model is then applied at each time step, together with the scheme, to approximate PDE solutions at given test points at the final time. To discover unknown coefficients in this setting, observations at two specific times are needed, and a mixed mean square error function is constructed to obtain the optimal coefficients. In the last part, a novel hybrid model combining the continuous and discrete time models is presented. It merges the flexibility of the continuous time model with the accuracy of the discrete time model. The performance of different models with different GP loss functions is also discussed. The effectiveness of the proposed PAGP methods is illustrated in our numerical section.
- North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
- North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
- North America > United States > Ohio (0.04)
- Energy (0.68)
- Government > Regional Government > North America Government > United States Government (0.46)
- Information Technology > Modeling & Simulation (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Why Business Executives Should Be Hip To ML Tools
I have spent most of my professional life in the age of AI and ML. Earlier in my career, at Uber, I worked with models that estimated ETAs, calculated dynamic pricing and even matched riders with drivers. My co-founder Jason previously led video ad company TubeMogul (acquired by Adobe), which relied on ML to ensure that its advertisers didn't waste their media spend on ads that nobody saw, or ads that only bots saw. Although ride-sharing and video advertising aren't often used in the same sentence, Jason and I faced similar challenges in ensuring that the models our companies deployed worked effectively and without bias. When models don't work as planned and machines, trained by data, make bad decisions, there is a direct impact on business results.